SemScribe: Natural Language Generation for Medical Reports
Abstract
Natural language generation in the medical domain is heavily influenced by domain knowledge and genre-specific text characteristics. We present SemScribe, an implemented natural language generation system that produces doctor’s letters, in particular descriptions of cardiological findings. Texts in this domain are characterized by a high density of information and a relatively telegraphic style. Domain knowledge is encoded in a medical ontology of about 80,000 concepts, which is used in particular for concept generalization during referring expression generation. Architecturally, the system is a generation pipeline that uses a corpus-informed syntactic frame approach to realize sentences appropriate to the domain. The system reads XML documents conforming to the HL7 Clinical Document Architecture (CDA) standard and enriches them with generated text and references to the data elements used. We conducted a first clinical trial evaluation with medical staff and report on the findings.
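The abstract does not say how concept generalization is carried out, so the following is only a minimal sketch of one plausible reading: when referring to a concept, climb an is-a hierarchy and use the most general ancestor that still excludes every competing concept in the context. The hierarchy `IS_A` and the functions `ancestors` and `generalize` are hypothetical and are not taken from SemScribe or its ontology.

```python
# Toy is-a hierarchy (hypothetical; NOT the SemScribe ontology).
IS_A = {
    "left ventricle": "ventricle",
    "right ventricle": "ventricle",
    "ventricle": "heart chamber",
    "left atrium": "atrium",
    "atrium": "heart chamber",
    "heart chamber": "anatomical structure",
    "mitral valve": "heart valve",
    "heart valve": "anatomical structure",
}


def ancestors(concept):
    """Return the concept followed by its ancestors, most specific first."""
    chain = [concept]
    while chain[-1] in IS_A:
        chain.append(IS_A[chain[-1]])
    return chain


def generalize(target, distractors):
    """Pick the most general ancestor of `target` that subsumes no distractor.

    Assumption: a referring expression may use a more general term as long
    as it cannot be confused with any other concept in the context.
    """
    best = target
    for candidate in ancestors(target):
        # Stop climbing once the candidate would also cover a distractor.
        if any(candidate in ancestors(d) for d in distractors):
            break
        best = candidate
    return best
```

In this toy setting, `generalize("left ventricle", ["left atrium"])` yields `"ventricle"`, while `generalize("left ventricle", ["right ventricle"])` stays at `"left ventricle"` because the more general term would be ambiguous.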
Similar resources
Lexical Parameters, Based on Corpus Analysis of English and Swedish Cancer Data, of Relevance for NLG
This paper reports on a corpus-based, contrastive study of the Swedish and English medical language in the cancer sub-domain. It is focused on the examination of a number of linguistic parameters differentiating two types of cancer-related textual material, one intended for medical experts and one for laymen. Language-dependent and language independent characteristics of the textual data betwee...
Natural Language Generation in Healthcare: Brief Review
Good communication is vital in healthcare, both among healthcare professionals, and between healthcare professionals and their patients. And well-written documents, describing and/or explaining the information in structured databases may be easier to comprehend, more edifying and even more convincing, than the structured data, even when presented in tabular or graphic form. Documents may be aut...
Structural variation in generated health reports
We present a natural language generator that produces a range of medical reports on the clinical histories of cancer patients, and discuss the problem of conceptual restatement in generating various textual views of the same conceptual content. We focus on two features of our system: the demand for “loose paraphrases” between the various reports on a given patient, with a high degree of semanti...
Using Knowledge Sources to Improve Classification of Medical Text Reports
Domain knowledge has been shown to be an important component of machine learning. However, the cost of obtaining domain knowledge to improve classifier generation can exceed the cost of manually creating classifiers. An alternative approach is to use existing knowledge sources to collect relevant domain knowledge, and improve machine learning. We investigated the use of two existing knowledge s...
An application design supporting structured radiology reports
Radiologists use medical images to diagnose diseases. Today, medical images are commonly stored and displayed electronically. The radiologist describes the image interpretations in a report, and the electronic viewing of images opens a possibility for the radiology report to become computerized. By creating an application that supports the radiologist’s report generation, tools for creation of ...
Publication date: 2012